This paper addresses some of the underlying statistical
assumptions and issues in the collection and refinement of
powder diffraction data. While standard data collection and
Rietveld analysis have been extremely successful in providing
structural information on a vast range of materials, there is
often uncertainty about the true accuracy of the derived
structural parameters. In this paper, we discuss a number of
topics concerning data collection and the statistics of data
analysis. We present a simple new function, the cumulative
chi-squared distribution, for assessing regions of misfit in a
diffraction pattern and introduce a matrix which relates the
impact of individual points in a powder diffraction pattern to
improvements in the estimated standard deviation of refined
parameters. From an experimental viewpoint, we emphasise the
importance of not over-counting at low angles and the routine
use of a variable counting scheme for data collection. Data
analysis issues are discussed within the framework of maximum
likelihood, which incorporates the current least-squares
strategies but also enables the impact of systematic
uncertainties in both observed and calculated data to be reduced.
1. Introduction



We can improve the quality of the structural results 
obtained from a powder diffraction pattern by a number 
of means. Firstly and most importantly, sufficient 
care should be taken in performing a good experiment 
and the observed diffraction data should be as free 
from systematic errors as possible. Due attention 
should be given to all parts of the diffraction pattern. 
The relative importance of, for example, low- and 
high-angle regions of a diffraction pattern should 
be assessed before performing the experiment and 
consideration paid to the balance of data collection 
statistics across the diffraction pattern. With structure 
solution and refinement from x-ray powder diffraction 
data, we stress the importance of a variable counting 
scheme that puts substantially increased weight on the
high-angle reflections and explain why over-counting
low-angle reflections can be deleterious to obtaining 
accurate structural parameters. 

After determining the best data collection protocol, 
the next consideration for obtaining good quality 
structural results is ensuring that the calculated 
diffraction pattern is modelled well. For example, 
a good understanding of the profile line shape 
through a fundamental parameters technique pays 
dividends in obtaining a good fit to the Bragg peak 
shape. 

On first thought, it might be expected that the
combination of a careful experiment followed by
careful modelling of the diffraction data is all that
needs to be considered to obtain good structural information.
However, there is an important third facet that is
rarely actively considered and indeed generally taken 
for granted — the algorithm behind fitting the model to 
the data. We generally assume that least-squares analy- 
sis is sufficient and indeed it is often so. However, 
least-squares is usually employed "because that's the 
way it has always been done" rather than because of a 
positive consideration of its applicability. This mirrors 
the experimental situation mentioned earlier where 
constant-time data-collection approaches are still often 
preferred over variable counting-time strategies despite 
the fact that it has been known for at least a decade that 
the latter procedure gives better, more accurate results 
for x-ray powder diffraction data [1,2]. 

The underlying principles of probability theory 
indicate that least-squares analysis is appropriate only 
if (i) the data points have an associated Gaussian error 
distribution and (ii) the proposed model is a complete 
representation of the observed data. Although these 
conditions appear to be rather restrictive, they are 
nevertheless broadly satisfied in most Rietveld analyses.
One exception to standard least-squares analysis
that was discussed several years ago is the situation
where the counts per data point are low (<20) and
follow a Poisson rather than a Gaussian distribution.
Antoniadis et al. showed that a maximum likelihood 
refinement with due account given to Poisson counting 
statistics was the correct approach [3]. Indeed, maxi- 
mum likelihood and Bayesian probability theory offer 
the correct formalism for considering all data and 
model uncertainties; least-squares analysis is just one, 
albeit relatively general, instance of maximum likelihood.
Careful consideration of the physical origins of
uncertainties in either data errors or insufficiencies in
the structural model leads to probability distribution
functions that must be optimised through maximum
likelihood methods.

The fundamental statistics approach that looks for a 
physical understanding of the uncertainties in a powder 
diffraction pattern is in many ways analogous to the 
fundamental parameters approach used in peak shape 
analysis. Both methods of analysis lead to more reliable 
results. In this paper, several applications of maximum 
likelihood that go beyond least-squares analysis are 
discussed. These include dealing with unknown 
systematic errors in the data, unattributable impurity 
phases and incomplete structural model descriptions. 



2. Assessing the Quality of a Rietveld 
Refinement 

Before considering how we can optimise our chances
of success using improved data collection methods or
alternative statistical approaches, it is worth benchmarking
the statistical quality of the Rietveld fit to a
powder diffraction pattern. The conventional goodness-of-fit
quantities used in the Rietveld method are the
standard R-factors and $\chi^2$ quantities. The following
four R-factors are generally quoted in most Rietveld
refinement programs:



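expected R-factor:

$$R_E = \sqrt{(N-P+C) \Big/ \sum_{i=1}^{N} w_i y_i^2}$$    (1a)

weighted profile R-factor:

$$R_{wP} = \sqrt{\sum_{i=1}^{N} w_i (y_i - M_i)^2 \Big/ \sum_{i=1}^{N} w_i y_i^2}$$    (1b)

profile R-factor:

$$R_P = \sum_{i=1}^{N} |y_i - M_i| \Big/ \sum_{i=1}^{N} y_i$$    (1c)

Bragg R-factor:

$$R_B = \sum_{k} |I_k^{\mathrm{obs}} - I_k^{\mathrm{calc}}| \Big/ \sum_{k} I_k^{\mathrm{obs}}$$    (1d)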



The expected R-factor is basically as good as the
weighted profile R-factor can get since the weighted
sum of the squares of the difference between observed
and calculated profile values, $\sum_{i=1}^{N} w_i (y_i - M_i)^2$, can at
best be equal to the number of independent data,
$(N-P+C)$, in the diffraction pattern since each weighted
squared profile difference in a best fit to the data should
be equal to unity. In a standard x-ray powder diffraction
pattern, the weight, $w_i$, is equal to $1/y_i$. Since $N$ is
generally much larger than either $P$ or $C$, the
expected profile R-factor can be rewritten as

$$R_E \approx 1 \Big/ \sqrt{\langle y \rangle}$$    (2)
The expected profile R-factor is thus equal to the
reciprocal of the square root of the average value of the
profile points. A small expected profile R-factor is
simply a statement about quantity and means that the
average number of counts in a profile is large; it bears
no relationship to the quality of a profile fit. In particular,
if the diffraction pattern consists of weak peaks on
top of a high background, then the expected R-factor
can be very low. For an average background count of
10 000, for example, the expected R-factor will be 1 %
or lower irrespective of the height of the Bragg peaks.
This has led to a preference for quoting background-subtracted
(b-s) R-factors,



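(b-s) expected R-factor:

$$R_{(b\text{-}s)E} = \sqrt{(N-P+C) \Big/ \sum_{i=1}^{N} w_i (y_i - b_i)^2}$$    (3a)

(b-s) weighted profile R-factor:

$$R_{(b\text{-}s)wP} = \sqrt{\sum_{i=1}^{N} w_i (y_i - M_i)^2 \Big/ \sum_{i=1}^{N} w_i (y_i - b_i)^2}$$    (3b)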



The (b-s) expected R-factor gives a much more
realistic measure of the quality of the data
($R_{(b\text{-}s)E} \approx 1/\sqrt{\langle (y-b)^2/y \rangle}$), and the (b-s) weighted
R-factor a measure of both the quality of the data and the
quality of the fit to the data. Even so, there are
caveats. Very fine profile steps in a diffraction pattern
lead to higher expected R-factors. For a given diffraction
pattern, doubling the step size (i.e., grouping points
together in pairs) will lead to an expected R-factor that
is smaller than before by a factor of roughly $\sqrt{2}$. Additionally,
R-factors may also be quoted for either the full profile
or only those profile points that contribute to Bragg
peaks. In themselves, therefore, profile R-factors
treated individually are at best indicators of the quality
of the data and the fit to the data. However, the ratio of
weighted profile to expected profile R-factors is a good
measure of how well the data are fitted. Indeed, the
normalised $\chi^2$ function is simply the square of the ratio

$$\chi^2 = \frac{1}{N-P+C} \sum_{i=1}^{N} w_i (y_i - M_i)^2 = (R_{wP}/R_E)^2 = (R_{(b\text{-}s)wP}/R_{(b\text{-}s)E})^2$$    (4)



(Note that the R-factor ratio holds whether or not the
background has been subtracted in the calculation of
the R-factor. The $\chi^2$ value will change, however, if only
those points that contribute to Bragg peaks are considered
instead of the full diffraction pattern.)
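
The relationships among these quantities can be checked numerically. The following is a minimal sketch rather than code from any Rietveld program: it assumes that N is the number of profile points, P the number of refined parameters and C the number of constraints, that the weights are $w_i = 1/y_i$, and it uses a synthetic pattern (one weak peak on a background of 10 000 counts). The function name `profile_agreement` and all numerical values are illustrative.

```python
import numpy as np


def profile_agreement(y, model, background, n_params, n_constraints=0):
    """R-factors (as fractions) and normalised chi-squared, Eqs. (1), (3), (4)."""
    y = np.asarray(y, dtype=float)
    model = np.asarray(model, dtype=float)
    background = np.asarray(background, dtype=float)
    w = 1.0 / y                                  # w_i = 1/y_i (counting statistics)
    dof = y.size - n_params + n_constraints      # N - P + C (assumed meaning)
    misfit = np.sum(w * (y - model) ** 2)        # sum of w_i (y_i - M_i)^2
    full = np.sum(w * y ** 2)                    # sum of w_i y_i^2
    net = np.sum(w * (y - background) ** 2)      # sum of w_i (y_i - b_i)^2
    return {
        "R_E": np.sqrt(dof / full),
        "R_wP": np.sqrt(misfit / full),
        "R_bs_E": np.sqrt(dof / net),
        "R_bs_wP": np.sqrt(misfit / net),
        "chi2": misfit / dof,
    }


# Synthetic pattern: one weak peak on a background of 10 000 counts.
rng = np.random.default_rng(0)
two_theta = np.linspace(5.0, 60.0, 2000)
background = np.full_like(two_theta, 10_000.0)
model = background + 500.0 * np.exp(-0.5 * ((two_theta - 25.0) / 0.05) ** 2)
y = rng.poisson(model).astype(float)

stats = profile_agreement(y, model, background, n_params=10)
for name, value in stats.items():
    print(f"{name:8s} {value:.4f}")
```

In this synthetic case the expected R-factor is close to 1 % simply because the background is high, whereas the background-subtracted value is far larger; the ratio of weighted to expected R-factors gives the same $\chi^2$ in both conventions.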

Bragg R-factors are quoted as an indicator of the
quality of the fit between observed and calculated
integrated intensities. It has been shown that the correct
integrated intensity R-factor can be obtained from a
Pawley or Le Bail analysis [4] where the extracted
"clumped" integrated intensities, $\langle J_h \rangle = \sum \langle I_h \rangle$, are
matched against the calculated "clumped" intensities,
$J_h = \sum I_h$, through the following equations:



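expected $R_I$-factor:

$$R_{(I)E} = \sqrt{(N_{\mathrm{clump}} - N_P + C) \Big/ \sum_{h} \sum_{k} \langle J_h \rangle W_{hk} \langle J_k \rangle}$$    (5a)

$R_I$-factor:

$$R_I = \sqrt{\sum_{h} \sum_{k} (\langle J_h \rangle - J_h) W_{hk} (\langle J_k \rangle - J_k) \Big/ \sum_{h} \sum_{k} \langle J_h \rangle W_{hk} \langle J_k \rangle}$$    (5b)

where a "clump" is a group of completely overlapped
reflections and the weight matrix $W_{hk}$ is the associated
Hessian matrix from the Pawley analysis. It is easily
shown that

$$W_{hk} = \sum_{i} w_i \, p(x_i - x_h) \, p(x_i - x_k)$$

where $p(x - x_k)$ is the normalised peak shape for reflection
$k$, which is situated at $x_k$. These weights are calculated
as part of the Pawley analysis but are easily
calculated independently and therefore the above
R-factors may also be derived from a Le Bail analysis.
The integrated-intensity $\chi_I^2$ is again simply the square
of the ratio of weighted and expected R-factors:

$$\chi_I^2 = \sum_{h} \sum_{k} (\langle J_h \rangle - J_h) W_{hk} (\langle J_k \rangle - J_k) \Big/ (N_{\mathrm{clump}} - N_P + C) = (R_I / R_{(I)E})^2$$    (6)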

There is a strong argument that the estimated
standard deviations of the structural parameters
obtained from a Rietveld analysis should be multiplied
by the square root of this $\chi_I^2$ function rather than, as is
conventional, the square root of the Rietveld $\chi^2$. This
usually leads to an additional inflation of between a
factor of 2 and 4 for the estimate of the standard
deviations of the structural parameters [4].
Interestingly, $\chi_I^2$ can be evaluated indirectly from a
combination of Rietveld and Pawley analyses on a
dataset. Within statistical errors the numerator of the
Rietveld $\chi^2$ function (i.e., the unnormalised Rietveld $\chi^2$
function) is equal to the sum of the unnormalised
Pawley and integrated intensity $\chi^2$ functions [4], i.e.,

$$\sum_{i} w_i (y_i - M_i^{\mathrm{Rietveld}})^2 \approx \sum_{i} w_i (y_i - M_i^{\mathrm{Pawley}})^2 + \sum_{h} \sum_{k} (\langle J_h \rangle - J_h) W_{hk} (\langle J_k \rangle - J_k)$$    (7)



In this section, we have shown that there is a
plethora of R-factors and $\chi^2$ functions that may be used
to evaluate the quality of, and the quality of fit to, a
powder diffraction pattern. Probably the most useful set
of quantities to use is the following:

• the background-subtracted, expected profile
R-factors evaluated over (a) full profile and
(b) Bragg peaks only (two quantities)

• the background-subtracted, weighted profile
Rietveld and Pawley (or Le Bail) R-factors
evaluated over (a) full profile and (b) Bragg peaks
only (four quantities)

• the Rietveld and Pawley (or Le Bail) $\chi^2$ functions
evaluated over (a) full profile and (b) Bragg peaks
only (two quantities)

• the expected and weighted integrated intensity
R-factors and associated $\chi_I^2$ (three quantities)

These quantities together give an indication of how 
well the profile data are fitted using (a) only the unit 
cell, peak shape and other profile parameters 
(Pawley/Le Bail quantities) and (b) a structural model 
(Rietveld quantities). The quantities associated with the 
integrated intensities allow a broad comparison to be 
made with single crystal results. 

As a final point in the discussion of R-factors, it is
worth noting that while expected Rietveld R-factors
will always improve with additional counting time, $t$,
(indeed it is straightforward to show from Eq. (2)
that $R_E \propto 1/\sqrt{t}$) the weighted profile R-factor bottoms
out at a constant value that does not improve with time.
This happens because the model cannot fit the data any
better and it is systematic errors that are dominating the
misfit. Indeed, David and Ibberson have shown that
counting times are often an order of magnitude longer
than necessary and that most datasets are probably
over-counted; these conclusions corroborate earlier
work by Baharie and Pawley [5,6].

3. The Cumulative $\chi^2$ Distribution

In the previous section, we showed that the Rietveld
$\chi^2$ function was a good measure of the quality of fit to
a powder diffraction pattern. Examining Eq. (4), it
is clear that $\chi^2$ is the weighted sum of the squares of
the difference between observed and calculated
powder diffraction patterns. An auxiliary plot of the
"difference/esd" underneath a fitted powder diffraction
pattern gives a good idea of where the pattern is fitted
well and where it is fitted poorly. Figure 1a shows the
fitted diffraction pattern for cimetidine collected on
station 2.3 at Daresbury. From the "difference/esd"
plot, regions of misfit can clearly be seen around some
of the strongest Bragg peaks between 22° and 24°.
However, the "difference/esd" plot only gives a qualitative
impression of how poor the fit is, even when the
plot of the diffraction pattern is expanded (Fig. 1b). To
assess the impact of a Bragg peak or a region of the
diffraction pattern on the overall fit to the data, we need
to assess the cumulative impact over that region. This
can be achieved by plotting the cumulative chi-squared
function, which is the weighted sum of the squares of
the difference between observed and calculated powder
diffraction patterns up to that point in the diffraction
pattern. The cumulative chi-squared function at the $n$th
point in the diffraction pattern is given by



$$\chi_n^2 = \sum_{i=1}^{n} w_i (y_i - M_i)^2 \Big/ (N-P+C).$$    (8)
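
As an illustration of Eq. (8), the cumulative function is a running sum of the squared "difference/esd" values divided by the number of independent data. The sketch below is not taken from the cimetidine analysis: it uses a synthetic pattern in which the calculated peak is deliberately displaced by 0.02° so that the misfit is concentrated in one region, and the names `cumulative_chi2`, `bad_model` and `fraction` are hypothetical.

```python
import numpy as np


def cumulative_chi2(y, model, sigma, n_params, n_constraints=0):
    """Cumulative chi-squared of Eq. (8) at every point of the pattern."""
    y, model, sigma = (np.asarray(a, dtype=float) for a in (y, model, sigma))
    dof = y.size - n_params + n_constraints          # N - P + C
    return np.cumsum(((y - model) / sigma) ** 2) / dof


# Synthetic pattern: one strong peak at 23 degrees on a flat background.
rng = np.random.default_rng(1)
two_theta = np.linspace(5.0, 65.0, 3000)
true_model = 200.0 + 5000.0 * np.exp(-0.5 * ((two_theta - 23.0) / 0.08) ** 2)
y = rng.poisson(true_model).astype(float)
sigma = np.sqrt(np.maximum(y, 1.0))                  # esd from counting statistics

# Deliberately imperfect calculated pattern: peak displaced by 0.02 degrees.
bad_model = 200.0 + 5000.0 * np.exp(-0.5 * ((two_theta - 23.02) / 0.08) ** 2)

chi2_n = cumulative_chi2(y, bad_model, sigma, n_params=12)

# Share of the total chi-squared accumulated between 22 and 24 degrees.
in_region = (two_theta >= 22.0) & (two_theta <= 24.0)
fraction = (chi2_n[in_region][-1] - chi2_n[in_region][0]) / chi2_n[-1]
print(f"total chi2 = {chi2_n[-1]:.2f}, fraction from 22-24 deg = {fraction:.2f}")
```

A well-fitted pattern gives an essentially straight line, whereas a localised misfit appears as a step whose height is the contribution of that region to the total.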



Examination of Fig. 1c shows that this function gives
a clear indication of where the principal areas of misfit
are in the powder diffraction pattern of cimetidine. The
region between 22° and 24° is indeed the worst area of
profile fit in the powder diffraction pattern. Around one
third of the total $\chi^2$ value is attributable to this
small region. Moreover, the first half of the pattern
contributes ~17/19 ≈ 90 % of the total misfitting.
The cumulative chi-squared plot clearly highlights the
problems in fitting the cimetidine data and provides
pointers to improving the fit to the data and hence
obtaining an improved, more accurate structural model.
Indeed, there are three directions that we can take to
improve the quality of profile fitting:
Fig. 1. Observed and calculated diffraction pattern of cimetidine. Tick marks indicate the positions of Bragg peaks while
the lower panel graph shows the difference/esd (the dotted lines represent ±3σ): (a) the full diffraction pattern; (b) expanded
range between 20° and 30° highlighting the region of major misfitting; (c) the full diffraction pattern along with the
cumulative chi-squared distribution.






Volume 109, Number 1, January-February 2004

Journal of Research of the National Institute of Standards and Technology 







(i) redo the experiment to count for shorter times
at low two-theta values and for longer at higher
two-theta values. This will reduce the cumulative
$\chi^2$ contribution in the 22° to 24° region and
up-weight the well-fitted high-angle data (see
Sec. 4.1).

(ii) develop an improved model to describe the diffraction
pattern; a good example of this might
be the inclusion of anisotropic line broadening.

(iii) downweight the regions of misfit if it proves
difficult to obtain a simple model. (In the 22° to
24° region, the misfitting may occur as a consequence
of disorder diffuse scattering; most
codes do not include this effect.) In such cases,
downweighting the misfitting points appropriately
will lead to improved, less biased structural
parameters (see Sec. 5.1 and Ref. [7]).



4. Assessing the Impact of Specific 
Regions of a Powder Diffraction 
Pattern 

In the previous section, we discussed global measures
of the quality of a Rietveld fit to a powder diffraction
pattern. Ideally, we would like to be able to go
further and devise an optimal methodology for collecting
diffraction data. What parts of a powder diffraction
pattern have the maximum impact on improving the
quality of a crystal structure refinement? What parts of
a diffraction pattern, for example, contribute most to
the precise determination of anisotropic displacement
parameters? The intuitive answer is that high-angle
reflections will be the most important but peak overlap
will reduce this impact. In fact, both low- and high-angle
regions (but, to a lesser extent, intermediate
regions) are generally important. The counterintuitive
importance of the low-angle reflections results from the
correlation of anisotropic displacement parameters
with the scale factor. How does one then assess the
impact of a single point in a diffraction pattern on the
precision of a particular structural parameter? Prince
and Nicholson showed for single crystal diffraction that
the impact of individual reflections may be assessed
statistically using standard least squares analysis [8].
Their procedure is easily extended to powder diffraction
data [9].

The covariance matrix, $V$, obtained from Rietveld
analysis is the best measure of the precision and correlation
of the refined parameters, $p_j$, $j = 1, \ldots, N_{\mathrm{par}}$, from
a powder diffraction pattern containing $N_{\mathrm{obs}}$ points; $x_i$, $y_i$
and $\sigma_i$ are, respectively, the position, profile value and
estimated standard deviation of the $i$th point in the
pattern, which is modelled by a function value, $M_i$. The
covariance matrix, $V$, is the inverse of the Hessian
matrix, $H$, which may be expressed as $H = A^T w A$
where the elements of $A$ are $A_{ij} = \partial M_i / \partial p_j$ and $w$ is the
weight matrix, which is usually diagonal with elements
$w_{ii} = 1/\sigma_i^2$. Forming the matrix $Z$ with elements
$Z_{ij} = (1/\sigma_i) \, \partial M_i / \partial p_j$ means that the Hessian matrix may
be written as $H = Z^T Z$. From this $Z$ matrix, the projection
matrix, $P$, may be formed from the equation
$P = Z (Z^T Z)^{-1} Z^T$ [8]. This matrix, although not often
discussed in least squares analysis, has a number of
important properties. Most notably, the on-diagonal
element, $P_{ii}$, is the leverage of a data point and has a
value between zero and one. A high leverage means that
a data point plays an important role in the overall model
fitting and vice versa. There is, however, another
significant quantity for the analysis of the variance of a
particular parameter.
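
The leverages can be computed without forming the full $N_{\mathrm{obs}} \times N_{\mathrm{obs}}$ projection matrix. The following is a minimal sketch under stated assumptions: the model is a single Gaussian peak on a flat background with four parameters (height, position, width, background), the esds come from counting statistics, and the function name `leverage` is illustrative. It uses a thin QR factorisation of $Z$, for which $P = QQ^T$.

```python
import numpy as np


def leverage(jacobian, sigma):
    """Diagonal of P = Z (Z^T Z)^-1 Z^T, with Z_ij = (1/sigma_i) dM_i/dp_j."""
    Z = np.asarray(jacobian, dtype=float) / np.asarray(sigma, dtype=float)[:, None]
    Q, _ = np.linalg.qr(Z)            # thin QR: P = Q Q^T when Z has full column rank
    return np.sum(Q ** 2, axis=1)     # P_ii is the squared length of row i of Q


# Toy model: Gaussian peak (height, position, width) on a flat background.
x = np.linspace(-1.0, 1.0, 201)
height, position, width, bkg = 1000.0, 0.0, 0.1, 50.0
g = np.exp(-0.5 * ((x - position) / width) ** 2)
model = bkg + height * g
jacobian = np.column_stack([
    g,                                                # dM/d(height)
    height * g * (x - position) / width ** 2,         # dM/d(position)
    height * g * (x - position) ** 2 / width ** 3,    # dM/d(width)
    np.ones_like(x),                                  # dM/d(background)
])
sigma = np.sqrt(model)

lev = leverage(jacobian, sigma)
print(f"min = {lev.min():.4f}, max = {lev.max():.4f}, sum = {lev.sum():.4f}")
```

Because $P$ is a projection matrix, the leverages sum to the number of refined parameters (four here), which is a convenient check on any implementation.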

Consider the impact on a particular element $V_{rs}$ of the
covariance matrix if the $i$th data point is collected for a
fraction $\alpha_i$ longer. The Hessian matrix is modified as
follows: $H' = H + \alpha_i z^T z$, where the row vector $z$ has
elements $z_j = (1/\sigma_i) \, \partial M_i / \partial p_j$. Since the Hessian and
covariance matrices are the inverses of each other,
the change in the covariance matrix may be shown
to be

$$V' = V - \alpha_i (V z^T z V) / (1 + \alpha_i z V z^T)$$    (9)



This equation may be simplified when it is recognised
that $z V z^T = P_{ii}$. Putting the vector $t = zV$ implies
that $(V z^T z V)_{rs} = (zV)_r (zV)_s = t_r t_s$ and thus, as long as $\alpha_i$ is
small, all the elements of the parameter covariance
matrix are altered as follows:

$$V'_{rs} = V_{rs} - \alpha_i t_r t_s / (1 + \alpha_i P_{ii}) \approx V_{rs} - \alpha_i t_r t_s.$$    (10)



The product $t_r t_s$ is thus a measure of the impact of the
$i$th point on element $rs$ of the covariance matrix. In
particular, $t_j^2$ is a measure of the importance of the $i$th
data point on the $j$th parameter; a large value of $t_j^2$
leads to a substantial reduction in the parameter
variance and a concomitant improvement in precision.
The quantity

$$t_j(i) = \sum_{k} \frac{1}{\sigma_i} \frac{\partial M_i}{\partial p_k} V_{kj}$$    (11)

is perhaps more informative than its square as it
provides information about the sense of the $i$th data
point contribution to the covariance terms. Its relationship
to the covariance matrix is essentially identical
to the relationship between the residual (observed -
calculated)/(estimated standard deviation) and the
overall $\chi^2$ goodness of fit. A specific example¹ of the
use of the t-matrix to determine the significance of
different parts of a powder diffraction pattern is discussed in
Ref. [9].
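
The covariance update can be verified directly against a brute-force inversion of the modified Hessian. The sketch below is illustrative only: it assumes a toy model (one Gaussian peak on a flat background), a diagonal weight matrix from counting statistics, and an arbitrary choice of point `i` and fraction `alpha`; the function name `covariance_and_t` is hypothetical.

```python
import numpy as np


def covariance_and_t(jacobian, sigma):
    """Return Z, V = (Z^T Z)^-1 and T, whose ith row is t = z V for point i."""
    Z = np.asarray(jacobian, dtype=float) / np.asarray(sigma, dtype=float)[:, None]
    V = np.linalg.inv(Z.T @ Z)
    return Z, V, Z @ V


# Toy model: Gaussian peak (height, position, width) on a flat background.
x = np.linspace(-5.0, 5.0, 201)
height, position, width, bkg = 1000.0, 0.0, 1.0, 50.0
g = np.exp(-0.5 * ((x - position) / width) ** 2)
model = bkg + height * g
jacobian = np.column_stack([
    g,
    height * g * (x - position) / width ** 2,
    height * g * (x - position) ** 2 / width ** 3,
    np.ones_like(x),
])
sigma = np.sqrt(model)

Z, V, T = covariance_and_t(jacobian, sigma)

i, alpha = 110, 0.5                      # count point i for 50 % longer
z = Z[i]
P_ii = z @ V @ z                         # leverage of point i
V_updated = V - alpha * np.outer(T[i], T[i]) / (1.0 + alpha * P_ii)
V_direct = np.linalg.inv(Z.T @ Z + alpha * np.outer(z, z))

# Compare the two results on a correlation-like scale.
scale = np.sqrt(np.outer(np.diag(V), np.diag(V)))
max_diff = np.max(np.abs(V_updated - V_direct) / scale)

# Index of the point with the largest |t_j(i)| for each parameter j.
most_useful = np.argmax(np.abs(T), axis=0)
print(max_diff, most_useful)
```

The exact form of the update (with the denominator $1 + \alpha_i P_{ii}$) reproduces the directly inverted matrix to rounding error, and every diagonal element of the covariance matrix can only decrease when a point is counted for longer.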

4.1 Variable Counting Time Protocols for X-Ray 
Powder Diffraction Data Collection 

The use of $t_j(i)$ as a diagnostic for determining accurate
structural parameters depends on whether we
believe that the errors in our data are well understood
or not. If we are sure that the sources of the errors in our
data are all known (the simplest case is the belief that
the only sources of uncertainty are from counting statistics)
then we will target those points in the diffraction
pattern that have the maximum values of $t_j(i)$ since
these will be the points that reduce the estimated standard
deviations of a parameter by the greatest amount.
It is intuitively obvious that we will get the most precise
assessment of the area of a peak by counting for
longest at the top of the peak and that we will get the
best indication of the peak position by counting at the
points of maximum gradient change on the peak. These
conclusions, however, do depend on us knowing with
complete confidence what the peak shape is. This point,
in turn, means that we can only use these maximum
impact points if we not only know the source of all our
experimental errors but also have complete confidence
in our model. While this may often be true for neutron
powder diffraction data, it is generally not the case for



1 This example concerns the analysis of orientational order in $C_{60}$
from neutron powder diffraction data. The t-matrix is used to show
that the deviations from spherical symmetry of the orientation
distribution function of $C_{60}$ in the high temperature phase can be well
modelled using neutron powder diffraction data and that powder
averaging is quite different from spherical averaging.
x-ray diffraction and patterns such as those shown for
cimetidine in Fig. 1 are the norm rather than the exception.
If we were entirely confident about the sources of
misfit in our low-angle diffraction data then we would
count for longer at low angles since this offers the
prospect of reducing the terms in the covariance
matrix by the largest amount. If we are uncertain about
our data errors and the sufficiency of our model then we
have to take an alternative approach to the problem that
is effectively opposite to the argument when the errors
are known. If we have an intense Bragg peak at low
angles and are uncertain about our errors then $t_j(i)$ tells
us that the variance terms will reduce substantially but
unfortunately in an incorrect way. We will have a more
precise result but a less accurate one. Indeed, as the
variance terms reduce, we will be faced with a result
that may be increasingly more precise while at the same
time decreasingly accurate. To obtain accurate results
in the face of uncertain errors, our best approach is to
distribute the errors as evenly as possible across all the
Bragg peaks. This means counting for substantially
longer at higher angles. There are two published
methods for deciding how to vary the counting time
across the diffraction pattern [1,4,10]. Both approaches
lead to essentially identical protocols and also both lead
to the important conclusion that higher angle parts of
the diffraction pattern may have to be counted for often
more than 30 times longer than low-angle regions. In
order to explain the rationale for longer counting times,
we follow the approach of David [4] and Shankland,
David and Sivia [10] that was developed with a view to
improving the chances of structure solution. The rationale
is based upon one of the central formulae of Direct
methods, the tangent formula, which determines
the probable relationship between the phases, $\varphi(\mathbf{h})$,
$\varphi(\mathbf{k})$ and $\varphi(\mathbf{h}-\mathbf{k})$:

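$$\tan[\varphi(\mathbf{h})] \approx \frac{\sum_{\mathbf{k}} |E(\mathbf{h}) E(\mathbf{k}) E(\mathbf{h}-\mathbf{k})| \sin[\varphi(\mathbf{k}) + \varphi(\mathbf{h}-\mathbf{k})]}{\sum_{\mathbf{k}} |E(\mathbf{h}) E(\mathbf{k}) E(\mathbf{h}-\mathbf{k})| \cos[\varphi(\mathbf{k}) + \varphi(\mathbf{h}-\mathbf{k})]}$$    (12)

where $\sigma_n = \sum_{j=1}^{N} [g_j(|\mathbf{h}| = 0)]^n$ and the normalised structure
factor, $E(\mathbf{h})$, is related to the integrated intensity,
$I(\mathbf{h}) = j(\mathbf{h}) |F(\mathbf{h})|^2$, by the equation
$|E(\mathbf{h})|^2 = I(\mathbf{h}) \big/ \sum_{j=1}^{N} g_j^2(\mathbf{h})$.²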



We simply require that the fractional error in $E(\mathbf{h})$
should be independent of where the reflection is in the
diffraction pattern. This, in turn, leads to the fact that all
components of the summations in the tangent formula
will on average be determined with equal precision.
When we collect a powder diffraction pattern, the
Bragg peak area, $A(\mathbf{h})$, is not the integrated intensity
itself but is modified by geometrical, absorption and
extinction terms. If we know that absorption and
extinction effects are severe, then we should include
their effects in evaluating the variable collection strategy.
However, if we work under the simpler assumption
that these effects are small, then $A(\mathbf{h}) = L_p I(\mathbf{h})$, where
$L_p$ is the Lorentz polarisation correction, and we will
count normalised structure factors, $E(\mathbf{h})$, with equal
precision across a powder diffraction pattern if we offset
the combined effects of $L_p$, the form-factor fall-off
and the Debye-Waller effects of thermal motion, i.e.,
$t(2\theta) \propto 1/[L_p(2\theta) \, g^2(2\theta)]$, where we have explicitly
used a 2-theta dependence. For the case of Bragg-Brentano
geometry on a laboratory-based x-ray powder
diffractometer, this becomes



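$$t(\theta) \propto \frac{(\sin\theta \sin 2\theta)(1 + \cos^2 2\alpha)}{(1 + \cos^2 2\alpha \cos^2 2\theta) \, f_{av}^2(\theta) \exp(-2 B_{av} \sin^2\theta / \lambda^2)}$$    (13a)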



2 $E(\mathbf{h}) = \sum_{j} g_j(\mathbf{h}) \exp(2\pi i \, \mathbf{h} \cdot \mathbf{r}_j)$ and $g_j(\mathbf{h}) = f_j(\mathbf{h}) \exp(-B_j / 4d^2)$.



where $f_{av}$ is a representative atomic scattering factor
(e.g., carbon), $B_{av}$ is an estimated overall Debye-Waller
factor, $\lambda$ is the incident wavelength and $2\alpha$ is the monochromator
take-off angle. For the case of Debye-Scherrer
geometry on a synchrotron x-ray powder
diffractometer, this simplifies to

$$t(\theta) \propto (\sin\theta \sin 2\theta) \big/ [f_{av}^2(\theta) \exp(-2 B_{av} \sin^2\theta / \lambda^2)].$$    (13b)
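
A counting-time profile of this kind is straightforward to tabulate. The sketch below is a hedged illustration, not the scheme used for the figures: the scattering factor is a hypothetical smooth placeholder rather than a tabulated carbon form factor, the values of the wavelength, the overall Debye-Waller factor and the take-off angle are arbitrary, and the names `relative_counting_time` and `placeholder_form_factor` are invented for the example. A real application would pass a tabulated $f_{av}$ as the `f_av` argument.

```python
import numpy as np


def relative_counting_time(two_theta_deg, wavelength, b_av, f_av,
                           two_alpha_deg=None, reference_deg=10.0):
    """Counting time from Eq. (13b), or Eq. (13a) if two_alpha_deg is given,
    normalised to unity at reference_deg (2-theta in degrees)."""

    def unnormalised(tt_deg):
        theta = np.radians(np.asarray(tt_deg, dtype=float)) / 2.0
        s = np.sin(theta) / wavelength                   # sin(theta)/lambda
        t = np.sin(theta) * np.sin(2.0 * theta)          # inverse Lorentz factor
        t = t / (f_av(s) ** 2 * np.exp(-2.0 * b_av * s ** 2))
        if two_alpha_deg is not None:                    # monochromator polarisation term
            c2a = np.cos(np.radians(two_alpha_deg)) ** 2
            t = t * (1.0 + c2a) / (1.0 + c2a * np.cos(2.0 * theta) ** 2)
        return t

    return unnormalised(two_theta_deg) / unnormalised(reference_deg)


def placeholder_form_factor(s):
    """Hypothetical smooth fall-off, NOT a tabulated scattering factor."""
    return 6.0 * np.exp(-5.0 * s ** 2)


two_theta = np.array([10.0, 30.0, 60.0, 90.0])
synchrotron = relative_counting_time(two_theta, wavelength=1.0, b_av=3.0,
                                     f_av=placeholder_form_factor)
laboratory = relative_counting_time(two_theta, wavelength=1.54, b_av=3.0,
                                    f_av=placeholder_form_factor,
                                    two_alpha_deg=26.6)   # illustrative take-off angle
print(synchrotron)
print(laboratory)
```

Because the placeholder form factor falls off faster than a real one, the absolute ratios printed here should not be read as recommended counting times; the sketch only shows how the three factors combine.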

The variable counting time schemes for these two
typical diffractometer settings are shown in Fig. 2. Both
laboratory and synchrotron variations show that the
counting times at intermediate angles should be substantially
longer than at low angles and extreme
backscattering. Interestingly, the 2-theta variations of
the variable counting time schemes are dominated as
much by the Lorentz polarisation correction as by the
form-factor fall-off and Debye-Waller variation. Indeed
at low angles, the principal effects are associated with
the Lorentz polarisation correction. All three effects
combine to create a substantial variation in
counting time as a function of 2-theta. Figure 3 compares
the constant counting time pattern (Fig. 3a)
with the raw counts using the variable counting
Fig. 2. Variable counting time schemes for both laboratory and
synchrotron diffractometers. The dilation is normalised to be unity
for 2θ = 10°.



time protocol (Fig. 3b) for the drug compound,
chlorothiazide. The Bragg peaks at high angle appear
to be of the same intensity as the low-angle reflections;
all the Bragg peaks in this diffraction pattern have
been reliably determined. This proved crucial in the
successful structure solution of the compound using
Direct methods as large numbers of reliable triplet
phase relationships could be formed [10]. A further
indication of the importance of using a variable counting
time scheme can be seen from the analysis of the
cumulative chi-squared distribution for the refinement
of the structure of famotidine (Fig. 4). The overall
chi-squared is low (~1.6) showing that a good fit has
been achieved over the full diffraction pattern.
Moreover, the cumulative chi-squared distribution
forms an essentially straight line over the full pattern
Fig. 3. Raw and normalised counts for synchrotron powder diffraction data of chlorothiazide. The inset shows the variable
counting scheme used. 
Fig. 4. The cumulative chi-squared distribution for famotidine overlaid upon the synchrotron powder diffraction pattern.
The benefits of the variable counting time scheme are clear as the impact of all regions of the pattern is similar.



indicating that all regions are fitted equally well and, as
a corollary, that the errors are also evenly distributed over
all the reflections. This is an important point as it
follows from this that the effects of systematic errors
must be substantially diminished over, for example, the
case of cimetidine (see Fig. 1c).

5. Beyond Least-Squares Analysis 

In the previous sections, we discussed from a statis- 
tical point of view how to assess the limitations of a 
Rietveld analysis and overcome these problems 
through the use of, for example, variable counting time 
protocols. What happens when we still have areas of 
the diffraction pattern that are not fitted well despite 
performing a careful experiment? If the misfit results 
from additional scattering from an unattributed impuri- 
ty phase then we can formulate this within the context 
of Bayesian probability theory and develop an appro- 
priate refinement procedure. If we have no real idea 
what has caused the misfitting — it may, for example, be 
lineshape effects, imperfect powder statistics or diffuse 
scattering — then we have to develop a catch-all proba- 
bilistic procedure for addressing this problem. If the 
misfitting involves a small proportion of the data, then 
we can develop a robust method of improving the accu- 
racy of our results. At the same time, however, our 



precision decreases because we have allowed the possi- 
bility of more sources of uncertainty than in a standard 
least-squares analysis. The approach used in this paper 
follows that of Sivia who aptly discussed the problem 
as one of "dealing with duff data" [11]. 

5.1 Dealing With Duff Data 

When we observe misfitting in a powder diffraction 
pattern, our first assumption is that the structural model 
that we have used to describe the data is not quite opti- 
mised. However, we often find that despite our best 
attempts, the data never fit well across the full diffrac- 
tion pattern and we are left with regions of misfit that 
may well be introducing systematic errors into our data. 
If we understand the source of this misfit — it may for 
example be an unattributable impurity phase — then we 
may be able to develop a suitably specific maximum 
likelihood refinement protocol. However, when we are 
unable to postulate a suitable explanation for misfitting, 
then we must develop a very general probabilistic 
approach, as has been done previously [11, 12]. If we 
take a standard point in our diffraction pattern that has, 
say, 400 counts we know from Gaussian counting 
statistics that our expected standard deviation will be 
around 20 counts. If we proceed through to the end of 
our least squares analysis with this assumption, then we 
are making a very definite statement about our errors.
We are saying categorically that we know all the
sources of our errors and that they result only from
counting statistics. Put in these terms, this is a bold 
assertion. Fortunately in most Rietveld analyses (and 
particularly in the area of neutron powder diffraction) 
this is a fair statement to make. However, we will show 
that even with good refinements, we can improve our 
accuracy (at the expense of some precision) by using a 
more robust algorithm. 

One of the things that we can say for sure when we 
have collected a point in our diffraction pattern with 
μ = 400 counts is that the uncertainty in our measurement
cannot be less than 20 counts, although it could be
more. We must generate a probability distribution for
our uncertainty; after all, we are no longer certain
about our uncertainties. A good distribution, because it
has the property of scale invariance, is the Jeffreys
distribution, 1/σ, for all values σ ≥ √μ. This probability
distribution for our uncertainty is shown in
Fig. 5a. The corresponding likelihood for the data is
obtained by integrating over this distribution

$$
p(D \mid \mu, \sigma \ge \sqrt{\mu}) = \int_{\sigma=\sqrt{\mu}}^{\infty} \mathrm{prob}(\sigma)\, \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{D-\mu}{\sigma}\right)^{2}\right] \mathrm{d}\sigma \qquad (14)
$$

which leads, not to a Gaussian likelihood but to an
error-function distribution

$$
p(D \mid \mu, \sigma \ge \sqrt{\mu}) \propto \frac{1}{|D-\mu|}\, \mathrm{erf}\left(\frac{|D-\mu|}{\sqrt{2\mu}}\right) \qquad (15)
$$



This is shown in Fig. 5b. The negative log-likelihood, 
which gives a direct comparison with the least-squares 
distribution, is shown in Fig. 5c. For large positive and 
negative deviations between observed and calculated 
data, the penalty no longer follows a quadratic form but 
rather a logarithmic distribution. Large deviations have 
less impact on this robust modified χ² function while
small deviations are treated just like the standard least- 
squares (albeit with a shallower distribution arising 
from our poorer state of knowledge about our uncer- 
tainties). 
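The contrast between the two penalties can be sketched numerically. The snippet below is a minimal illustration, not the implementation used for the refinements in this paper. It assumes that the robust likelihood is a Gaussian integrated over a 1/σ prior for σ ≥ σ_min, which is proportional to erf(r)/r with r = |obs − calc|/(√2 σ_min). The function names and the choice to shift the penalty to zero at obs = calc are assumptions made for the example.

```python
import math

def least_squares_penalty(obs, calc, sigma_min):
    """Gaussian negative log-likelihood (constant dropped): z**2 / 2."""
    z = (obs - calc) / sigma_min
    return 0.5 * z * z

def robust_penalty(obs, calc, sigma_min):
    """Negative log of erf(r)/r, with r = |obs - calc| / (sqrt(2) * sigma_min).

    Assumed form: a Gaussian integrated over a 1/sigma prior for
    sigma >= sigma_min. Shifted so that the penalty is zero at obs == calc.
    """
    r = abs(obs - calc) / (math.sqrt(2.0) * sigma_min)
    offset = math.log(2.0 / math.sqrt(math.pi))  # log of the r -> 0 limit of erf(r)/r
    if r < 1e-8:
        return 0.0
    return offset - math.log(math.erf(r) / r)

# A point with 400 counts, so the counting-statistics bound is sqrt(400) = 20.
counts, sigma_min = 400.0, 20.0
for n_sigma in (0.5, 1.0, 3.0, 10.0, 30.0):
    obs = counts + n_sigma * sigma_min
    print(n_sigma,
          least_squares_penalty(obs, counts, sigma_min),
          robust_penalty(obs, counts, sigma_min))
```

Under this assumed form, expanding erf(r)/r for small r gives a penalty of approximately z²/6, one third of the least-squares curvature, which is consistent with the shallower distribution noted above. For large deviations the penalty grows as log|z|.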

We illustrate the use of this robust statistic for the
case of a high resolution x-ray powder diffraction
pattern of urea collected on BM16 at the ESRF,
Grenoble. Standard least-squares analysis leads to a
satisfactory weighted profile χ² of ~3.7. However,
examination of the cumulative χ² plot (Fig. 6) shows
that almost a quarter of the misfit is associated with
the strongest Bragg peak. This could result from
preferred orientation, detector saturation or particle
statistics; we do not know which. The cumulative robust χ²
distribution, on the other hand, contains no such bias
towards this single peak. Indeed, the linear variation of
the cumulative robust χ² distribution across the full
pattern gives a reassuring degree of confidence to this
modified least-squares approach. Moreover, a comparison
of the structural parameters for the conventional
and robust least-squares approaches with single crystal
data (Table 1) convincingly shows the benefits of the robust
metric, which automatically downweights bad data.
With conventional least-squares, the results are good
and the estimated standard deviations are small.
However, nine of the fourteen structural parameters are
more than four standard deviations different from their
single crystal counterparts, indicating that the accuracy
of the parameters obtained from the least-squares analysis
does not measure up to their precision. On the other
hand, only one of the structural parameters from the
robust analysis is more than 4σ away from the corresponding
single crystal value. The parameter changes
are modest between least-squares and robust analyses.
However, the differences are real and the improvements
in accuracy when benchmarked against the
single crystal parameters are significant. While it is
dangerous to extrapolate from a single example, the
underlying statistical framework is sound and suggests
that, when significant jumps are found in a cumulative
chi-squared plot, then a robust analysis is worthwhile.

Fig. 5. Robust least squares. (a) The probability distribution function
associated with using the counting statistics error as a lower
uncertainty bound and a scale-invariant Jeffreys prior to represent
the degree of ignorance of other errors; (b) the standard least-squares
likelihood (dotted line) compared with the robust likelihood
(dashed line) derived from the probability distribution function
shown in Fig. 5a; (c) the negative log-likelihood (or chi-squared
equivalent) for standard least-squares (dotted line) and robust
statistics (dashed line).

Fig. 6. Comparison of the cumulative standard chi-squared function with the cumulative robust chi-squared function for
urea. The synchrotron powder diffraction pattern of urea is shown in the background.
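A check for jumps of this kind is easy to automate. The sketch below assumes that the cumulative chi-squared is simply the running sum of ((obs − calc)/esd)² across the pattern, with any normalisation omitted. The function names and the four-point toy pattern are invented for illustration and are not data from the urea refinement.

```python
from itertools import accumulate

def cumulative_chi_squared(obs, calc, esd):
    """Running sum of squared weighted residuals across the pattern."""
    terms = [((o - c) / s) ** 2 for o, c, s in zip(obs, calc, esd)]
    return list(accumulate(terms))

def largest_step(cumulative):
    """Index and fractional size of the largest single-point increase."""
    steps = [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]
    index = max(range(len(steps)), key=steps.__getitem__)
    return index, steps[index] / cumulative[-1]

# Toy pattern (made-up numbers): one strong, badly fitted peak at index 2.
obs = [100.0, 105.0, 400.0, 98.0]
calc = [100.0, 100.0, 340.0, 100.0]
esd = [10.0, 10.0, 20.0, 10.0]

curve = cumulative_chi_squared(obs, calc, esd)
index, fraction = largest_step(curve)
print(curve, index, fraction)
```

A well-fitted pattern should give a curve that rises roughly linearly. A single step carrying a large fraction of the total, as at index 2 here, marks a region worth inspecting before trusting the refined parameters.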



5.2 Refinement in the Presence of Unattributable 
Impurity Phases 

What do you do when you want to perform a 
Rietveld analysis of a particular material but have a 
substantial impurity phase and despite all your best 
attempts you can neither remove it from your sample 
nor index it from your diffraction pattern? 
Conventional wisdom would state that your chances of 
obtaining unbiased structural parameters are poor and 
that the best you can do is to manually exclude the 
offending impurity peaks. Standard Rietveld programs 
that are based upon a least-squares refinement 
algorithm cannot cope in an unbiased manner with an 
incomplete model description of the data. This is just 
the situation where Bayesian probability theory can 
come to the rescue. We can ask the question, "How do 
I perform a refinement on a powder diffraction pattern 
when I know that there is an impurity phase present but 
have no idea what that impurity phase may be?" This 
question is equivalent to stating that my diffraction 
pattern contains a component that I can model (known 
phases + background) and an additional positive, 
unknown contribution. It turns out that enforcing the 
positivity of the unknown component as an additive 
contribution is sufficient to produce excellent results 
[7]. 
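One way to write this statement down is sketched here. The notation is assumed for illustration rather than taken from Ref. [7]: D_i is the observed value at point i, M_i the modelled contribution (known phases plus background), A_i the unknown positive contribution, σ_i the estimated standard deviation, and I the background information.

```latex
\[
D_i = M_i + A_i + \varepsilon_i, \qquad A_i \ge 0, \qquad
\varepsilon_i \sim \mathcal{N}(0, \sigma_i^{2})
\]
\[
p(D_i \mid M_i, \sigma_i, I) = \int_{0}^{\infty} p(A_i \mid I)\,
\frac{1}{\sigma_i\sqrt{2\pi}}
\exp\left[-\frac{(D_i - M_i - A_i)^{2}}{2\sigma_i^{2}}\right] \mathrm{d}A_i
\]
```

Because the prior vanishes for negative A_i, a point lying below the model cannot be explained away by the unknown term, whereas a point lying above it can.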

The mathematical development of these ideas has
been presented elsewhere and results in a modified χ²
goodness of fit function that is shown in Fig. 7 [7,13].
For observed data that are less than the model function,
the new goodness of fit behaves essentially identically
to the standard χ². This is to be expected since such
points are unlikely to be associated with an impurity
contribution. On the other hand, when the observed
data value is substantially greater than the fitted model
value, then the new goodness of fit brings a substantially
smaller penalty (the function varies logarithmically)
than the quadratic behaviour of the standard χ². Again
this is just what is required to minimise the impact of
any impurity phase. Note also that the curvature of the
new goodness of fit is shallower than the standard χ².
This means that quoted standard deviations will be
higher for refinements using the new goodness of fit.
This is to be expected as the allowance for an impurity
phase brings a greater uncertainty into the model
parameter values.

Fig. 7. The modified robust goodness of fit function (solid line)
compared with the standard quadratic least-squares function.

Diffraction patterns of yttria and rutile were collected
on HRPD at ISIS. Results from the 5 % yttria : 95 %
rutile mixture are shown in Fig. 9. (The fitted diffraction
pattern of pure yttria is shown in Fig. 8 for comparison.) In
order to accentuate the difference between the new
goodness of fit function and standard least-squares
analysis, we have chosen to refine the minority yttria
phase, treating the majority phase as the impurity (see
Fig. 9a). The excellent fit to the data for the modified χ²
is shown in Fig. 9b, where we have graphically down-weighted
the observed points that contribute least to
the goodness of fit. This emphasises what the algorithm
is effectively doing: large positive (obs-calc)/esd
values are essentially ignored. In effect, the algorithm
is optimally excluding those regions that do not
contribute to the model. The relative calculated peak
intensities agree very well with the results for pure
yttria (Fig. 8). Least-squares analysis (Fig. 9c) produces
a completely different result: all points are considered
with no downweighting for possible impurities. The
first obvious effect is that the refined background is too
high, since the strong impurity peaks lift up the model
fit. The relative peak intensities are, however, also very
different from the correct values, suggesting that the
refined structural parameters are substantially in error.
This is indeed the case and is borne out by analysis of
the refined yttrium and oxygen coordinates, which are shown
graphically in Fig. 10 as a function of yttria content. We
briefly consider the other refined parameters (a fuller
analysis is given in Ref. [7]). The scale factor is correct
within estimated standard deviation (esd) for the robust
analysis but behaves wildly for the standard least
squares, exceeding 1000 % for 25 % yttria content. The
least-squares analysis of the lattice constant also
becomes increasingly unreliable as the refinement
locks into peaks associated with rutile as well as yttria.
On the other hand, the lattice constant from the robust
refinement is satisfyingly stable; the esds increase as
the yttria content decreases (the 5 % esd is some five
times larger than the 100 % value) but all results lie
within a standard deviation of the correct result.

Fig. 8. The observed and calculated diffraction patterns for pure yttria determined on HRPD at ISIS.

Fig. 9. Observed and calculated diffraction patterns for the composition 5 % yttria : 95 % rutile: (a) robust analysis showing the full observed
data range (the grey scale described in the text is not used in this figure); (b) expanded region highlighting the successful robust refinement
(the down-weighting grey scale is used in this figure); (c) the least-squares analysis showing the poor agreement between the observed and
calculated patterns.

Fig. 10. The refined atomic coordinates of yttria plotted as a function
of yttria composition. Open circles and filled squares correspond
to the least-squares and robust analyses, respectively. (a) The yttrium
x coordinate; (b), (c), (d) the oxygen x, y, and z coordinates. The
dotted lines correspond to the correct values obtained from
least-squares refinement of the pure-yttria diffraction pattern.

5.3 Summary of Maximum Likelihood 
Refinement Algorithms 

Least-squares Rietveld analysis is the best and least- 
biased method of structure refinement from a powder 
diffraction pattern when the data can be fully modelled. 
However, when there is an unmodelled contribution in 
the diffraction pattern, least-squares analysis gives 
biased results. In the impurity phase example discussed 
in this contribution, significant deviations from the 
correct parameter values occur when there is as little as 
a 10 % impurity contribution. At higher impurity
levels, least-squares analysis is completely unreliable.
These problems may, however, be overcome if the existence
of an unknown impurity contribution is built into
the refinement algorithm. While it might seem to be a
logical inconsistency to build in information about an
unknown contribution, Bayesian probability theory
provides a framework for doing just this. Only two
broad assumptions are necessary to derive an appropriate
modified probability distribution function. These
are (i) that the impurity contribution must be intrinsically
positive and (ii) that its magnitude, A, is unknown
and thus best modelled by a Jeffreys prior, given by
p(A|I) ∝ 1/A for A > 0 and p(A|I) = 0 for A ≤ 0. This
produces a modified "χ²" function (see Fig. 7) that
effectively excludes the impact of impurity peaks.
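The effect of these two assumptions can be explored numerically. The sketch below is not the closed-form function of Refs. [7, 13]; it is a stand-in that marginalises a Gaussian likelihood over a positive additive term A by trapezoidal integration on a logarithmic grid. Because a pure 1/A prior cannot be normalised, the prior is truncated to [a_min_rel × esd, a_max_rel × esd]. These cut-offs, the grid size and the function name are all assumptions made for the example.

```python
import math

def impurity_penalty(obs, calc, esd, a_min_rel=1e-3, a_max_rel=1e3, n=2000):
    """Negative log-likelihood with a positive impurity term marginalised out.

    Prior on A: proportional to 1/A on [a_min_rel*esd, a_max_rel*esd], zero
    elsewhere (the cut-offs are illustrative assumptions). With u = log(A)
    the prior is flat in u, so the integral is a plain average over u.
    """
    delta = obs - calc
    lo = math.log(a_min_rel * esd)
    hi = math.log(a_max_rel * esd)
    h = (hi - lo) / n
    total = 0.0
    for k in range(n + 1):
        a = math.exp(lo + k * h)              # impurity size on a log grid
        z = (delta - a) / esd
        weight = 0.5 if k in (0, n) else 1.0  # trapezoid end-point weights
        total += weight * math.exp(-0.5 * z * z)
    likelihood = total * h / ((hi - lo) * esd * math.sqrt(2.0 * math.pi))
    return -math.log(likelihood)

esd = 20.0
base = impurity_penalty(400.0, 400.0, esd)
above = impurity_penalty(500.0, 400.0, esd) - base  # observed 5 esd above model
below = impurity_penalty(300.0, 400.0, esd) - base  # observed 5 esd below model
print(above, below, 0.5 * 5.0 ** 2)                 # last value: least-squares penalty
```

With these settings the extra penalty for a point five esds below the model stays close to the least-squares value, while the same deviation above the model costs far less. This is the asymmetry that lets the refinement ignore impurity peaks. The sketch does not guard against numerical underflow for very large negative deviations.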

The results discussed briefly in this contribution
and more extensively in Ref. [13] show that the
improvement over conventional least-squares analysis
is dramatic. Indeed, even in the presence of very substantial
impurity contributions (see Fig. 10) the refined
structural parameters are within a standard deviation of
their correct values.

It must, however, be stated as a final caveat that care 
should be taken with this approach and the use of an 
algorithm that can cope with the presence of impurities 
should be seen as a last resort. Indeed, every effort 
should be made to determine all the phases in a sample. 
It is much more desirable to include the impurity phase 
in a standard Rietveld refinement. 

Table 1. Structural parameters obtained for urea from single crystal results (column 2) and high-resolution powder diffraction data. Two separate
analyses were performed on the powder diffraction data. Results from a standard least-squares analysis are shown in column 3 and compared with
the single crystal results in column 4. The results from the robust analysis are listed in column 5 and compared with the single crystal results in
the final, sixth column. The shaded cells indicate discrepancies that are beyond 4σ.